Add external data coordinates within the scripts
using the tribble() function.
Add text and labels to ggplot maps using the
{ggsflabel} package.
Add color palettes to spatial data using the
{colorspace} package.
Add arrow and scale annotations in ggplot maps
using the {ggspatial} package.
This lesson requires the following packages:
if(!require('pacman')) install.packages('pacman')
pacman::p_load(tidyverse,
colorspace,
ggspatial,
janitor,
ggplot2,
spData,
units,
sf)
pacman::p_load_gh("yutannihilation/ggsflabel",
"afrimapr/afrihealthsites",
"afrimapr/afrilearndata",
"wmgeolab/rgeoboundaries")Until now, we have learnt general concepts about geospatial visualization in independent lessons.
The modular approach of ggplot2 allows
to successively add all of them in different layers.
For instance, study sites or administrative delineations, and a visual
representation of their characteristics in a single map, using useful
color palettes.
These enriched thematic maps also require to contain text and labels referring to specific places or regions, and important map elements like scale bars and a north arrow, as will be illustrated in this part.
Figure 1. Ggplot map with multiple layers.
We will create a map to figure out how are Hospitals distributed in the Western Province of Sierra Leone. For this, we are going to simulate the location of two Hospital manually collected in the field, and also retrieve real Hospital information from a public repository.
tribble()We start by defining two Hospital sites (point data), according to
their longitude and latitude, collected by GPS devices in the field,
using the tribble() function from the {tibble}
package:
sites <- tribble(~gps_name, ~gps_longitude, ~gps_latitude,
"site A", -13.1491617, 8.1920813,
"site B", -13.1066807, 8.2180983
)
sites## # A tibble: 2 × 3
## gps_name gps_longitude gps_latitude
## <chr> <dbl> <dbl>
## 1 site A -13.1 8.19
## 2 site B -13.1 8.22
Create a tibble object with three columns (gps_name,
gps_longitude, gps_latitude) with the GPS
location of one "household" with longitude
-13.1856942 and latitude 8.2851963. Use the
tribble() function.
tribble is better used for a minimum amount of
observations. For larger amounts you may prefer to read
your data from a csv or excel file.
These sites belong to Sierra Leone:
sierra_leone <- geoboundaries(country = "Sierra Leone", adm_lvl = 1)To add these geographic coordinates to a country map, we need to use
the power of sf.
Converting the data frame to a sf object allows to rely
on sf to handle on the fly the coordinate
system (both projection and extent), which
can be very useful if the two objects (here sierra_leone,
and sites) are not in the same projection.
To achieve this, the projection (here WGS84, which
is the CRS code #4326) has to be a priori defined in the
sf object with the st_as_sf() function
(for more information: see previous lessons) :
sites_sf <- sites %>%
st_as_sf(coords = c("gps_longitude","gps_latitude"),
crs = 4326)Now we can make a map with sierra_leone and
sites_sf objects, and add the type of data
(gps_name) in the legend:
ggplot() +
geom_sf(data = sierra_leone) +
geom_sf(data = sites_sf, mapping = aes(color = gps_name))We can use coord_sf() after all geom_sf()
calls, to zoom in to our area of interest inside Sierra
Leone:
ggplot() +
# geometries
geom_sf(data = sierra_leone) +
geom_sf(data = sites_sf, mapping = aes(color = gps_name)) +
# map extent
coord_sf(xlim = c(-13.5,-12.7), ylim = c(8.0,8.7))As such, we can adjust all characteristics of points
(e.g. color of the outline and the filling, shape, size, etc.), for each
geom_sf() layer. In this example, we set the two points as
filled diamonds (shape = 18) with a bigger size
(size = 2):
ggplot() +
# geometries
geom_sf(data = sierra_leone) +
geom_sf(data = sites_sf, mapping = aes(color = gps_name),
shape = 18, size = 4) +
# map extent
coord_sf(xlim = c(-13.5,-12.7), ylim = c(8.0,8.7))geom_sf_label_repel()It would be informative to add finer administrative information on
top of the previous map, starting with administrative borders
(sf object: polygon data) and their names. The package
ggsflabel provides functions to add labels to
sf objects (points and polygons) in maps.
Province names are part of this data, as the shapeName
variable.
sierra_leone %>%
ggplot() +
geom_sf() +
geom_sf_label_repel(aes(label=shapeName))The zimbabwe_adm1 object contains the boundaries of all
the provinces in Zimbabwe.
zimbabwe_adm1 <- geoboundaries(country = "Zimbabwe", adm_lvl = 1)Create a map of Zimbabwe with labels for the name of each of its provinces.
To continue building the complex map introduced at the beginning of
the lesson, province data is directly plotted as an additional
sf layer using geom_sf(). In addition,
province names will be added using geom_sf_label_repel(),
as well as the label (from shapeName), and a relatively big
font size.
ggplot() +
# geometries
geom_sf(data = sierra_leone) +
geom_sf(data = sites_sf, mapping = aes(color = gps_name),
shape = 18, size = 2) +
# labels
geom_sf_label_repel(data = sierra_leone, mapping = aes(label=shapeName)) +
# map extent
coord_sf(xlim = c(-13.5,-12.7), ylim = c(8.0,8.7))We can drop the province name of “Eastern”. For this, we can use
filter() from the {dplyr} package:
ggplot() +
# geometries
geom_sf(data = sierra_leone) +
geom_sf(data = sites_sf, mapping = aes(color = gps_name),
shape = 18, size = 2) +
# label
geom_sf_label_repel(data = sierra_leone %>% filter(shapeName!="Eastern"),
mapping = aes(label=shapeName)) +
# map extent
coord_sf(xlim = c(-13.5,-12.7), ylim = c(8.0,8.7)){colorspace}Districts (polygon data) can be retrieved from local shapefile data. This time, only districts from Western province are retained:
sierra_leone_west <-
sf::read_sf(dsn = here::here("ch06_basic_geospatial_viz",
"data","gis","shp",
"sle_adm3.shp"),
quiet = TRUE) %>%
filter(admin1Name=="Western")
ggplot(data = sierra_leone_west) +
geom_sf()This time, for all the districts from the province retained, we
compute their area using st_area() from the package
sf:
sierra_leone_shp <- sierra_leone_west %>%
# calculate area of each polygon
mutate(area_m2 = st_area(.)) %>%
# convert m2 to km2
mutate(area_km2 = units::set_units(.$area_m2,km^2)) %>%
# transform the variable to numeric
mutate(area_km2 = as.numeric(area_km2))
sierra_leone_shp %>%
select(area_km2)## Simple feature collection with 12 features and 1 field
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -13.29894 ymin: 8.094272 xmax: -12.91333 ymax: 8.499809
## Geodetic CRS: WGS 84
## # A tibble: 12 × 2
## area_km2 geometry
## <dbl> <MULTIPOLYGON [°]>
## 1 167. (((-13.02082 8.381989, -13.0188 8.377952, -13.01781 8.377952, -13.01603 8…
## 2 38.9 (((-13.21496 8.474341, -13.21479 8.474289, -13.21465 8.474296, -13.21455 …
## 3 167. (((-13.12215 8.413156, -13.12155 8.413105, -13.12013 8.413218, -13.11941 …
## 4 243. (((-13.24441 8.095223, -13.24574 8.094272, -13.24742 8.094518, -13.2491 8…
## 5 2.30 (((-13.22646 8.489716, -13.22648 8.48955, -13.22644 8.489513, -13.22663 8…
## 6 1.75 (((-13.2129 8.494033, -13.21076 8.494026, -13.21013 8.494041, -13.2096 8.…
## 7 1.83 (((-13.22653 8.491883, -13.22647 8.491853, -13.22642 8.49186, -13.22633 8…
## 8 0.796 (((-13.23154 8.491768, -13.23141 8.491566, -13.23144 8.49146, -13.23131 8…
## 9 20.8 (((-13.28529 8.497354, -13.28456 8.496497, -13.28403 8.49621, -13.28338 8…
## 10 2.23 (((-13.24677 8.493453, -13.24669 8.493285, -13.2464 8.493132, -13.24627 8…
## 11 6.69 (((-13.25698 8.485518, -13.25685 8.485501, -13.25668 8.485505, -13.25657 …
## 12 37.9 (((-13.20465 8.485758, -13.20461 8.485698, -13.20449 8.485757, -13.20431 …
We can now fill in the districts using their area to
visually identify the largest counties. For this, we use the {colorspace}
package with some transparency. In this case, area_km2
is a continuous variable, with a sequential scale
type:
sierra_leone_shp %>%
ggplot() +
# geometry
geom_sf(aes(fill = area_km2)) +
# aesthetic
colorspace::scale_fill_continuous_sequential(palette="Reds 3", alpha = 0.8)The scales are called via the scheme
scale_<aesthetic>_<datatype>_<colorscale>()
where:
<aesthetic> is the name of the aesthetic
(fill, color, colour).<datatype> is the type of the variable
plotted (discrete, continuous, binned).<colorscale> sets the type of the color
scale used (qualitative, sequential, diverging,
divergingx).This is a ggplot map with the afriairports object:
afriairports %>%
ggplot() +
geom_sf(aes(color = elevation_ft))Paste this code to your answer and:
Update the color aesthetic of the continuous
variable elevation_ft of this map to a diverging
scale, with a midpoint (mid) in 5000.
A Sequential colors scale indicate:
A Diverging color scale allows to:
You can use {colorspace} package to:
{ggspatial}We introduce here the package {ggspatial}, which
provides easy-to-use functions to create a scale bar and north arrow on
a ggplot map:
annotation_north_arrow() allows to add the north symbol
andannotation_scale() a scale bar.sierra_leone_shp %>%
ggplot() +
# geometry
geom_sf(aes(fill = area_km2)) +
# aesthetic
colorspace::scale_fill_continuous_sequential(palette="Reds 3", alpha = 0.8) +
annotation_north_arrow(location="tr") +
annotation_scale(location="br")The location of the scale bar and north arrow are by
default in the bottom left ("bl") side of the map. They can
be specified using the location argument with
"tr" for top right, "bl" for bottom left,
etc.
This is a ggplot map with the zimbabwe_adm1 object:
zimbabwe_adm1 %>%
ggplot() +
geom_sf()Paste this code to your answer and:
Add a Scale bar located in the bottom right of the map,
and a North arrow in the top left.
The North
arrow style in annotation_north_arrow() can also be
adjusted using the style argument.
Note that scale
distance is set to "km" by default in
annotation_scale(); you can set it in “m”, “cm”, “mi”,
“ft”, or “in”.
geom_sf_text_repel()To make a more complete map of the Western province of Sierra Leona,
Health facilities (sf object: point data) will be added to
the map.
Instead of looking up coordinates manually, the package
{afrihealthsites} provides a function
afrihealthsites(), which allows to retrieve geographic
coordinates of African health facilities from different sources:
You can run the following code to retrieve geographic coordinates of
all the health facilities in Sierra Leone available in the
who database:
sle_healthsites_all <- afrihealthsites(country = "Sierra Leone",
datasource='who',
plot = FALSE,
returnclass = "dataframe") %>%
janitor::clean_names()
sle_healthsites_all## # A tibble: 1,120 × 12
## country admin1 facility_name facility_type ownership lat long ll_source iso3c
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <chr> <chr>
## 1 Sierra Le… Easte… Ahmadiyya Mi… Mission Hosp… FBO 7.89 -11.2 GPS SLE
## 2 Sierra Le… Easte… Baama Commun… Community He… MoH 8.36 -11.3 GPS SLE
## 3 Sierra Le… Easte… Baiama Commu… Community He… MoH 8.52 -11.0 GPS SLE
## 4 Sierra Le… Easte… Baiima Commu… Community He… MoH 8.03 -10.8 GPS SLE
## 5 Sierra Le… Easte… Baiwala Comm… Community He… MoH 8.00 -10.6 GPS SLE
## 6 Sierra Le… Easte… Bambara Kaim… Community He… MoH 8.48 -11.3 GPS SLE
## 7 Sierra Le… Easte… Bambara Mate… Maternal & C… MoH 8.32 -11.3 GPS SLE
## 8 Sierra Le… Easte… Bambawulo Co… Community He… MoH 8.01 -11.1 GPS SLE
## 9 Sierra Le… Easte… Bandajuma Co… Community He… MoH 8.31 -10.8 GPS SLE
## 10 Sierra Le… Easte… Bandajuma Kp… Community He… MoH 8.02 -10.9 GPS SLE
## # … with 1,110 more rows, and 3 more variables: facility_type_9 <chr>, tier <dbl>,
## # tier_name <chr>
Access to all the health facilities of Zimbabwe from the
healthsites data source using the
afrihealthsites() function.
According to the three-tier health delivery classification, the highest tier (Tier 3) includes county hospitals in rural regions and city hospitals in urban regions, which are responsible for most inpatient services as well as teaching and research missions (Wang, 2021).}
We can now keep the Tier 3 health facilities inside the Western
province, and convert the data frame with coordinates to sf
format:
sle_healthsites_set <- sle_healthsites_all %>%
filter(admin1=="Western Area") %>%
filter(tier==3) %>%
st_as_sf(coords = c("long","lat"),
crs = 4326)We add hospital locations and names as text on the map with
geom_sf_text():
sle_healthsites_set %>%
ggplot() +
# geometry
geom_sf() +
# label
geom_sf_text(mapping = aes(label= facility_name))This is not really satisfactory, as the names overlap on the points,
and they are not easy to read on the grey background. The package
{ggsflabel} offers a very flexible approach (inspired
in {ggrepel}) to deal with label placement in
ggplot maps with sf objects (with
geom_sf_text_repel and geom_sf_label_repel),
including automated movement of labels in case of overlap.
We use it here to “nudge” the labels away, and connect them to the city locations:
sle_healthsites_set %>%
ggplot() +
# geometry
geom_sf() +
# label
geom_sf_text_repel(mapping = aes(label= facility_name))This is a ggplot map of all the hospital facilities of
Zimbabwe:
afrihealthsites(country = 'Zimbabwe',
datasource='healthsites',
plot = FALSE,
returnclass = 'dataframe') %>%
filter(amenity == 'hospital') %>%
ggplot() +
geom_sf()Paste this code to your answer and:
Add their names as text without overlaps to a ggplot
map.
Additionally, we can manually set {ggrepel}
arguments to improve its output:
size);fontface);force);box.padding).sle_healthsites_set %>%
ggplot() +
# geometry
geom_sf() +
# label
geom_sf_text_repel(mapping = aes(label= facility_name),
size = 3,
fontface = "bold",
force = 40,
box.padding = 0.6)For the final map, we put everything together, having a general background map based on the Sierra Leona map, with district delineations, province labels, main hospital names and locations, custom GPS collected field sites, as well as a theme adjusted with axis labels, and a north arrow and scale bar:
ggplot() +
## geometries
# background map
geom_sf(data = sierra_leone) +
# district polygons filled by area
geom_sf(data = sierra_leone_shp,
mapping = aes(fill = area_km2)) +
# hospital points
geom_sf(data = sle_healthsites_set) +
# field site points
geom_sf(data = sites_sf,
mapping = aes(color = gps_name),
shape = 18,
size = 4) +
## labels
# hospital names with repelled text
geom_sf_text_repel(data = sle_healthsites_set,
mapping = aes(label = facility_name),
size = 2,
fontface = "bold",
box.padding = 0.6,
force = 0.5,
nudge_x = -0.25,
direction = "y",
hjust = 1,
segment.size = 0.2) +
# province names with repelled labels
geom_sf_label_repel(data = sierra_leone %>%
filter(shapeName!="Eastern"),
mapping = aes(label=shapeName)) +
## aesthetics
# alternative color scale for fill
colorspace::scale_fill_continuous_sequential(palette="Reds 3",
alpha = 0.8) +
# map annotation
annotation_north_arrow(location="tl") +
annotation_scale(location="bl") +
# map extent
coord_sf(xlim = c(-13.5,-12.7),
ylim = c(8.0,8.7)) +
# ggplot labels
labs(x = "Longitude",
y = "Latitude",
fill = expression(Area~km^2),
color = "GPS data",
title = "How are Hospitals distributed in the Western Province of Sierra Leone?")This example fully demonstrates that adding layers
on ggplot2 is relatively straightforward, as long as the
data is properly stored in an sf object. Adding additional
layers like external data, color
palettes, point or polygon labels and
map annotations would simply follow the same logic,
with additional calls after geom_sf() and at the
right place in the ggplot2 sequence.
The following team members contributed to this lesson:
Some material in this lesson was adapted from the following sources:
Moreno, M., Basille, M. Drawing beautiful maps programmatically with R, sf and ggplot2 — Part 1: Basics. (2018). Retrieved 01 June 2022, from https://r-spatial.org/r/2018/10/25/ggplot2-sf.html
Moreno, M., Basille, M. Drawing beautiful maps programmatically with R, sf and ggplot2 — Part 2: Layers. (2018). Retrieved 01 June 2022, from https://r-spatial.org/r/2018/10/25/ggplot2-sf-2.html
Wilke, Claus O. Fundamentals of Data Visualization. Chapter 4: Color scales. (2020). Retrieved 01 June 2022, from https://clauswilke.com/dataviz/color-basics.html#color-to-represent-data-values